
146 research outputs found

The MEROPS Database

Author: Alan J. Barrett
Neil D. Rawlings
Publication date: 20/04/2009

Many proteins undergo important post-translational proteolytic processing to remove targeting signals and activation peptides, and most proteins undergo proteolytic inactivation and catabolism. The enzymes that hydrolyse the peptide bonds in proteins and peptides are known as peptidases, proteases or proteolytic enzymes. The MEROPS database (http://merops.sanger.ac.uk) presents the classification and nomenclature of peptidases, their inhibitors and substrates. In 1993 we proposed the scheme for the classification of peptidases that has been internationally accepted, and in 1996 we established the MEROPS database. Protein inhibitors have been included in the database since 2004. About 2% of the genes in a genome encode peptidase homologues, and a further 1% encode protein inhibitors. For example, the human genome has 1037 genes encoding peptidase homologues (of which 643 are known or predicted to be active peptidases) and 433 protein inhibitor genes (of which 144 have been biochemically characterized as inhibitors).

The MEROPS classification is hierarchical. Sequences are grouped into a peptidase species (each of which is given a unique identifier, for example C01.060 for cathepsin B); peptidase species are grouped into a family (for example C1); and families grouped into a clan (for example CA). To be included in the same protein species, sequences must be derived from the same node on a dendrogram derived from the family sequence alignment and known (or predicted) to share similar specificity. To be included in the same family sequences must be homologous over the sequence domain that contains the active site residues (peptidases) or reactive site (inhibitors). To be included in the same clan, the proteins must share similar tertiary structures (or the same linear arrangement of active site residues if the structure is unknown). Over 117,000 peptidase homologues are classified into 3114 protein species, 205 families and 52 clans, and 12,104 protein inhibitors are classified into 663 protein species, 64 families and 33 clans.
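The three-level hierarchy described above (protein species, family, clan) can be pictured with a small Python sketch. This is only an illustration, not part of MEROPS itself: the identifier pattern, the rule that a family name is the identifier prefix with zero padding removed, and the `FAMILY_TO_CLAN` lookup are all assumptions inferred from the single example in the text (C01.060, family C1, clan CA).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed identifier layout: one catalytic-type letter, a two-digit family
# number, a dot, and a three-character suffix (for example "C01.060").
_ID_PATTERN = re.compile(r"^([A-Z])(\d{2})\.([0-9A-Z]{3})$")

# Hypothetical family -> clan lookup. Only the C1 -> CA pair comes from the
# text; the clan cannot be derived from the identifier string itself.
FAMILY_TO_CLAN = {"C1": "CA"}


@dataclass(frozen=True)
class Classification:
    identifier: str       # protein species level, e.g. "C01.060"
    family: str           # family level, e.g. "C1"
    clan: Optional[str]   # clan level, e.g. "CA"; None if not in the lookup


def classify(identifier: str) -> Classification:
    """Place an identifier in the species -> family -> clan hierarchy."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    letter, number, _suffix = match.groups()
    # Assumption: the family name drops the zero padding ("C01" -> "C1").
    family = f"{letter}{int(number)}"
    return Classification(identifier, family, FAMILY_TO_CLAN.get(family))


if __name__ == "__main__":
    print(classify("C01.060"))
```

A family missing from the lookup yields `clan=None` rather than a guess, which mirrors the fact that clan membership rests on structural evidence, not on the identifier.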

The database includes manually curated summaries for each clan, family and protein species. There are also sequence alignments and manually curated bibliographies (with over 41,000 references) at every level. In addition to protein inhibitors we also include 158 manually curated summaries for synthetic and naturally occurring small molecule inhibitors. There is also a summary page for each organism listing all known homologues and an analysis highlighting significant presences, absences or gene family expansions for organisms with a completely sequenced genome. 

The MEROPS database includes known peptidase substrates: naturally occurring peptides and proteins, and synthetic substrates. Currently there are 4091 cleavages of synthetic substrates and 95,413 cleavages of proteins (of which 74,740 are physiological). Cleavages in proteins are mapped to UniProt entries. An alignment of very close homologues of each substrate sequence is shown, highlighting residues around each cleavage site indicating whether the peptidase is known to accept the amino acid at that position or not. Cleavage sites that are conserved are likely to be physiological; cleavage sites that are not conserved may be pathological for the species in which they occur or coincidental.

The MEROPS data is freely available to download from our FTP site (http://ftp.sanger.ac.uk/pub/MEROPS) and via our Distributed Annotation System (DAS) server (http://das.sanger.ac.uk/das/merops).

Crossref

Nature Precedings

Bacterial calpains and the evolution of the calpain (C2) family of peptidases

Author: Neil D. Rawlings
Publication venue: Springer Nature
Publication date: 01/01/2015

Springer - Publisher Connector

Pepsin homologues in bacteria

Author: Bateman Alex
Rawlings Neil D
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Peptidase family A1, to which pepsin belongs, had been assumed to be restricted to eukaryotes. The tertiary structure of pepsin shows two lobes with similar folds and it has been suggested that the gene has arisen from an ancient duplication and fusion event. The only sequence similarity between the lobes is restricted to the motif around the active site aspartate and a hydrophobic-hydrophobic-Gly motif. Together, these contribute to an essential structural feature known as a psi-loop. There is one such psi-loop in each lobe, and so each lobe presents an active Asp. The human immunodeficiency virus peptidase, retropepsin, from peptidase family A2 also has a similar fold but consists of one lobe only and has to dimerize to be active. All known members of family A1 show the bilobed structure, but it is unclear if the ancestor of family A1 was similar to an A2 peptidase, or if the ancestral retropepsin was derived from a half-pepsin gene. The presence of a pepsin homologue in a prokaryote might give insights into the evolution of the pepsin family.
Results: Homologues of the aspartic peptidase pepsin have been found in the completed genomic sequences from seven species of bacteria. The bacterial homologues, unlike those from eukaryotes, do not possess signal peptides, and would therefore be intracellular acting at neutral pH. The bacterial homologues have Thr218 replaced by Asp, a change which in renin has been shown to confer activity at neutral pH. No pepsin homologues could be detected in any archaean genome.
Conclusion: The peptidase family A1 is found in some species of bacteria as well as eukaryotes. The bacterial homologues fall into two groups, one from oceanic bacteria and one from plant symbionts. The bacterial homologues are all predicted to be intracellular proteins, unlike the eukaryotic enzymes. The bacterial homologues are bilobed like pepsin, implying that if no horizontal gene transfer has occurred the duplication and fusion event might be very ancient indeed, preceding the divergence of bacteria and eukaryotes. It is unclear whether all the bacterial homologues are derived from horizontal gene transfer, but those from the plant symbionts probably are. The homologues from oceanic bacteria are most closely related to memapsins (or BACE-1 and BACE-2), but are so divergent that they are close to the root of the phylogenetic tree and to the division of the A1 family into two subfamilies.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MEROPS: the peptidase database

Author: Barrett Alan J.
Morton Fraser R.
Rawlings Neil D.
Publication venue: Oxford University Press
Publication date: 28/12/2005

Peptidases (proteolytic enzymes) and their natural, protein inhibitors are of great relevance to biology, medicine and biotechnology. The MEROPS database aims to fulfil the need for an integrated source of information about these proteins. The organizational principle of the database is a hierarchical classification in which homologous sets of proteins of interest are grouped into families and the homologous families are grouped in clans. The most important addition to the database has been newly written, concise text annotations for each peptidase family. Other forms of information recently added include highlighting of active site residues (or the replacements that render some homologues inactive) in the sequence displays and BlastP search results, dynamically generated alignments and trees at the peptidase or inhibitor level, and a curated list of human and mouse homologues that have been experimentally characterized as active. A new way to display information at taxonomic levels higher than species has been devised. In the Literature pages, references have been flagged to draw attention to particularly ‘hot’ topics.

Crossref

PubMed Central

A comparison of Pfam and MEROPS: Two databases, one comprehensive, and one specialised.

Author: Barrett Alan J
Bateman Alex
Rawlings Neil D
Studholme David J
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: We wished to compare two databases based on sequence similarity: one that aims to be comprehensive in its coverage of known sequences, and one that specialises in a relatively small subset of known sequences. One of the motivations behind this study was quality control. Pfam is a comprehensive collection of alignments and hidden Markov models representing families of proteins and domains. MEROPS is a catalogue and classification of enzymes with proteolytic activity (peptidases or proteases). These secondary databases are used by researchers worldwide, yet their contents are not peer reviewed. Therefore, we hoped that a systematic comparison of the contents of Pfam and MEROPS would highlight missing members and false-positives leading to improvements in quality of both databases. An additional reason for carrying out this study was to explore the extent of consensus in the definition of a protein family. RESULTS: About half (89 out of 174) of the peptidase families in MEROPS overlapped single Pfam families. A further 32 MEROPS families overlapped multiple Pfam families. Where possible, new Pfam families were built to represent most of the MEROPS families that did not overlap Pfam. When comparing the numbers of sequences found in the overlap between a MEROPS family and its corresponding Pfam family, in most cases the overlap was substantial (52 pairs of MEROPS and Pfam families had an intersection size of greater than 75% of the union) but there were some differences in the sets of sequences included in the MEROPS families versus the overlapping Pfam families. CONCLUSIONS: A number of the discrepancies between MEROPS families and their corresponding Pfam families arose from differences in the aims and philosophies of the two databases. Examination of some of the discrepancies highlighted additional members of families, which have subsequently been added in both Pfam and MEROPS. This has led to improvements in the quality of both databases. 
Overall, there was a great deal of consensus between the databases in their definitions of a protein family.
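The overlap criterion quoted in the results (an intersection size greater than 75% of the union) is the Jaccard index of the two membership sets. A minimal Python sketch follows, assuming each family is represented as a set of sequence accessions; the function name and the accession strings are hypothetical and are not taken from either database.

```python
from typing import Iterable


def overlap_fraction(merops_members: Iterable[str],
                     pfam_members: Iterable[str]) -> float:
    """Return |A intersect B| / |A union B| for two sets of sequence accessions."""
    a, b = set(merops_members), set(pfam_members)
    union = a | b
    if not union:
        # Two empty families share nothing; avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical accessions, used purely for illustration.
    merops_family = {"SEQ1", "SEQ2", "SEQ3", "SEQ4"}
    pfam_family = {"SEQ1", "SEQ2", "SEQ3", "SEQ4", "SEQ5"}
    fraction = overlap_fraction(merops_family, pfam_family)
    print(f"{fraction:.2f}", fraction > 0.75)
```

Dividing by the union rather than by the size of either family penalises sequences missing from either side, so a small family nested inside a much larger one does not score as a close match.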

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Research Exeter

Identification of the active site of legumain links it to caspases, clostripain and gingipains in a new clan of cysteine endopeptidases

Author: Barrett Alan J.
Chen Jinq-May
Rawlings Neil D.
Stevens Richard A.E.
Publication venue: Federation of European Biochemical Societies. Published by Elsevier B.V.
Publication date: 28/12/1998

We show by site-directed mutagenesis that the catalytic residues of mammalian legumain, a recently discovered lysosomal asparaginyl cysteine endopeptidase, form a catalytic dyad in the motif His-Gly-spacer-Ala-Cys. We note that the same motif is present in the caspases, aspartate-specific endopeptidases central to the process of apoptosis in animal cells, and also in the families of clostripain and gingipain which are arginyl/lysyl endopeptidases of pathogenic bacteria. We propose that the four families have similar protein folds, are evolutionarily related in clan CD, and have common characteristics including substrate specificities dominated by the interactions of the S1 subsite.

Elsevier - Publisher Connector

A Primitive Enzyme for a Primitive Cell: The Protease Required for Excystation of Giardia

Author: Alvarado Lilia
Engel Juan C
Franklin Christopher
McKerrow James H
Rawlings Neil D
Ward Wendy
Publication venue: Cell Press.
Publication date: 02/05/1997

Protozoan parasites of the genus Giardia are one of the earliest lineages of eukaryotic cells. To initiate infection, trophozoites emerge from a cyst in the host. Excystation is blocked by specific cysteine protease inhibitors. Using a biotinylated inhibitor, the target protease was identified and its corresponding gene cloned. The protease was localized to vesicles that release their contents just prior to excystation. The Giardia protease is the earliest known branch of the cathepsin B family. Its phylogeny confirms that the cathepsin B lineage evolved in primitive eukaryotic cells, prior to the divergence of plant and animal kingdoms, and underscores the diversity of cellular functions that this enzyme family facilitates.

Elsevier - Publisher Connector

LUD, a new protein domain associated with lactate utilization.

Author: Aravind L
Axelrod Herbert L
Bakolitsa Constantina
Bateman Alex
Coggill Penelope C
Eberhardt Ruth Y
Godzik Adam
Hwang William C
Pascual Jaime
Peterson Scott N
Punta Marco
Rawlings Neil D
Sedova Mayya
Publication venue: eScholarship, University of California
Publication date: 01/11/2013

Background: A novel highly conserved protein domain, DUF162 [Pfam: PF02589], can be mapped to two proteins: LutB and LutC. Both proteins are encoded by a highly conserved LutABC operon, which has been implicated in lactate utilization in bacteria. Based on our analysis of its sequence, structure, and recent experimental evidence reported by other groups, we hereby redefine DUF162 as the LUD domain family.
Results: JCSG solved the first crystal structure [PDB:2G40] from the LUD domain family: LutC protein, encoded by ORF DR_1909, of Deinococcus radiodurans. LutC shares features with domains in the functionally diverse ISOCOT superfamily. We have observed that the LUD domain has an increased abundance in the human gut microbiome.
Conclusions: We propose a model for the substrate and cofactor binding and regulation in LUD domain. The significance of LUD-containing proteins in the human gut microbiome, and the implication of lactate metabolism in the radiation-resistance of Deinococcus radiodurans are discussed.

PubMed Central

eScholarship - University of California

New mini-zincin structures provide a minimal scaffold for members of this metallopeptidase superfamily

Author: Christine B Trame
Herbert L Axelrod
Marco Punta
Neil D Rawlings
Penelope Coggill
Ruth Y Eberhardt
Yuanyuan Chang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Crossref

Springer - Publisher Connector

Structural Analysis of Papain-Like NlpC/P60 Superfamily Enzymes with a Circularly Permuted Topology Reveals Potential Lipid Binding Sites

Author: A Kumar
Adam Godzik
AE Cohen
AE Speers
Ashley M. Deacon
BD Santarsiero
BH Ha
C Sers
CH Pai
DW Cruickshank
DW Grosenbach
E Blanc
G Bricogne
GD Van Duyne
GN Murshudov
HE Klock
Heath E. Klock
Hsiu-Ju Chiu
Ian A. Wilson
IW Davis
J Pei
J von Lintig
JM Aramini
Joel L. Sussman
K Husmann
K Sugawara
KA Estes
L Xue
LM Iyer
Lukasz Jaroszewski
M-A Elsliger
Marc-Andre Elsliger
Mark W. Knuth
MD Resh
Mitchell D. Miller
MS Mondal
ND Rawlings
Neil D. Rawlings
P Emsley
Q Xu
Qingping Xu
SA Lesley
Scott A. Lesley
T Uyama
TC Terwilliger
TG Senkevich
TR Schneider
V Anantharaman
W Hanna-Rose
W Kabsch
WJ Jahng
Publication venue: Public Library of Science
Publication date: 01/01/2011

NlpC/P60 superfamily papain-like enzymes play important roles in all kingdoms of life. Two members of this superfamily, LRAT-like and YaeF/YiiX-like families, were predicted to contain a catalytic domain that is circularly permuted such that the catalytic cysteine is located near the C-terminus, instead of at the N-terminus. These permuted enzymes are widespread in viruses, pathogenic bacteria, and eukaryotes. We determined the crystal structure of a member of the YaeF/YiiX-like family from Bacillus cereus in complex with lysine. The structure, which adopts a ligand-induced, “closed” conformation, confirms the circular permutation of catalytic residues. A comparative analysis of other related protein structures within the NlpC/P60 superfamily is presented. Permuted NlpC/P60 enzymes contain a similar conserved core and arrangement of catalytic residues, including a Cys/His-containing triad and an additional conserved tyrosine. More surprisingly, permuted enzymes have a hydrophobic S1 binding pocket that is distinct from previously characterized enzymes in the family, indicative of novel substrate specificity. Further analysis of a structural homolog, YiiX (PDB 2if6), identified a fatty acid in the conserved hydrophobic pocket, thus providing additional insights into the possible function of these novel enzymes.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California